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1 Introduction 



The interest in statistical systems close to criticality is shared by a large 
community including condensed matter physicists and particle theorists. An 
important tool in the study of such systems are numerical experiments in 
the form of Monte Carlo simulations which complement analytical results 
that are available for special systems and limiting cases. One of the central 
problems in such simulations is the degradation of most known simulation 
algorithms as criticality — the cutoff or continuum limit in quantum field 
theories describing particles — is approached. The interdisciplinary search 
of improved techniques is an on-going effort, but it already has yielded some 
very positive results in recent years some of which we want to briefly re- 
view here. Our focus will be on spin systems with variables belonging to 
continuous manifolds (a-models) and on pure lattice gauge theory. This se- 
lection thus leaves out the vast field of discrete Ising-like systems^ as well as 
the enormous problems (and potential gains from algorithmic improvement) 
faced in QCD simulations with fermions.^ 

The problem of critical slowing down (CSD) near criticality is schemati- 
cally described by the dynamical scaling law 

r = cr- (1) 

Here ^ is some physical scale or correlation length in lattice units and r is a 
time scale in number of iterations. A r and corresponding z can characterize 
the time scale for equilibration or the rate at which statistically independent 
estimates for observables of interest are produced. Then for the expectation 
value of A, 

(A[s]) = i / Dse-^^[^(")1 A\sl (2) 

Zj J 

the accuracy in estimating A improves as 

5a = ^variance^ x 2r/# of iterations . (3) 

In (0) Ds = na d/i(s(x)) means integration over all fields or spins s{x) at 
lattice sites x with some measure, Z is the partition function and (3H[s\ is 
the inverse bare coupling or temperature times some action or Hamiltonian. 

The algorithms to be described here have been designed to lower z from 
its "traditional" value of about two for standard local Metropolis methods 
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to 2; ~ 1 or even 2; ~ in some cases. They thus improve the efficiency of 
simulations by one to two powers of ^. 

Beyond the introduction this article will continue with a section on the 
technique of embedded variables (common to many algorithms), and short 
descriptions of cluster algorithms, multigrid methods and overrelaxation fol- 
lowed by some conclusions. 

2 Updating Embedded Variables 

Let us imagine some group of transformations T G G that act on the config- 
urations s={s(x)}, 

s ^ Ts, (4) 

such that the measure is left invariant, Ds = DTs. Such transformations 
can be local, Ts = {t{x)s{x)}, and then t{x) is a field similar to s itself 
with values in local group factors that make up G. One may however also 
consider more global changes of s, where T acts for instance on some cluster 
of spins from s with one and the same rotation. For a given configuration s 
we can now think of a statistical system with configurations T E G and an 
effective or induced Boltzmann factor exp(— /3if[Ts]) whose simulation can 
also be considered. Assume now that we have a Monte Carlo algorithm for 
this system which is characterized by transition probabilities w{s\Ti T2) 
that preserve exp(— /3i7[Ts]) and depend on s only through the effective 
Boltzmann factor such that w{Ts;Ti — * T2) = w{s;TiT T2T) holds. 
Then, if ervalid algorithm for the original field s{x): 

• for momentarily fixed s = Si put Ti = Id as initial configuration, 

• update Ti T2 with the w-algorithm, 

• take S2 = T2S1 as a new s. 

To prove this assertion we ffist note that the overall transition probability 
is given by f\ 

Wisi ^S2) = J2 ^(s; ^ T) 5(32 - Tsi). (5) 

T 

^For simplicity we take a discrete group G here; in the continuous case the sum over 
all elements has to be replaced by an invariant group integration. 
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We have to show that W preserves the Boltzmann weight exp(— To 
this end we transform 



J Dsi e-^^[^il W{si S2) = 

I I ^ 'j^/ 

y^/DsiEe-^''^'^'^i'^(s2-r'si) = e-^^[^^l (6) 

To arrive at the third hne changes of variables si — > Tsi and T' —>■ T'T~~^ 
are carried out and T is averaged over. In the last line we absorbed T' into 
Si and the 5-function is used. 

In many cases, in particular if G is a lower dimensional manifold than 
the original configuration space, the moves described are not ergodic. This 
can often be improved by using families of different embeddings between 
which one switches in a random or deterministic order. In cases where this 
is not sufficient one can always blend in conventional update steps to achieve 
ergodicity. 



3 Cluster Algorithms 

Cluster algorithms^ for continuous spins are an example of the successful use 
ovariables. Their drawback has been up to now that the 0{n) invariant n- 
vector models are the only continuous variable a-models where they are pow- 
erful. [] Here however, according to accumulated numerical evidence^, they 
really achieve z ~ with even a very small coefficient c in (|ip. With the ad- 
ditional advantage of variance reduced estimators for Green functions, these 
models have become an ideal testing ground for nonperturbative physics^, in 
particular in two dimensions, where they are asymptotically free for n > 3. 
Also the xy-model (n = 2) is of great interest^ as it allows for checks of the 
Kosterlitz Thouless scenario. We therefore now specialize our general setup, 

t There are principal reasons for this hmitation which come close to a no-go theorem.^ 
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Eq.(0), to the 0{n) models, where 
s{x)eR", Ds = l[d''s{x) 6{s{xf -1), -H[s]= J2 s{x) ■ s{y). (7) 

X <xy> 

The key to efficient cluster algorithms is the embedding of Ising spins 
and the use of global update techniques for them. A family of embeddings 
is labelled by an n-component unit vector r of the same type as the local 

spins. The group involved is Gr — zf^^^^^ corresponding to one Ising spin 
a{x) = ±1 on each site, T {cr(a;)}. They act as reflections, 

s' = Ts, s'{x) = s±{x) + a{x)s\\{x), (8) 

where _L and || refer to the r-direction. The induced Hamiltonian 

— H[Ts] = r ■ s{x) r ■ s{y) a{x)(y{y) + terms independent of a (9) 

<xy> 

is recognized to describe a ferromagnetic Ising model with random bond 
strengths J^xy> = f ■ s{x) r ■ s{y). When they are multiplied along closed 
loops the result is always positive or zero due to their factorized form, which 
shows the absence of frustration. We also note that there is no magnetic 
field coupling to a. This can be regarded as being due to the reflections 
in (^) being part of the 0{n) global symmetry which leads to a global Z2 
invariance for a. For these embedded random systems the Swendsen-Wang 
algorithm*^ or its single cluster variant^ work very well, actually better than 
in the standard Ising model where some remaining CSD is still detectable. 

The algorithm just described is ergodic if r is chosen at random such that 
all directions can appear. In practical realizations it is usually convenient 
to always take r = (1,0, ... ,0). Then the embedding requires fewer oper- 
ations, and ergodicity is restored by globally 0(n)-transforming the whole 
configuration with a random rotation or reflection after a certain number of 
updates. 

4 Multigrid Techniques 

Multigrid Monte Carlo (MGMC) techniques have been proposed^ to elimi- 
nate CSD by allowing for efficient moves of a critical system on all scales. 
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They work well on nearly gaussian systems^" as they do in the related problem 
of linear difference equations where the method has made its first appear- 
ance. Here we will not review the truly recursive MGMC^^ approach but 
a simpler unigrid variant that was proposed recently. It has become the 
optimal method for cr-models other than the 0{n) family. 

Wc now present the method for the 0(3)-modcl.^^ The actual updates are 
performed here (and in realizations for other a-models) on embedded U{1) 
spins of the xy-model type. To an 0(3) generator corresponding to some 
rotation, as for example 

/ 1 \ 
a = -1 0, (10) 

V / 

we couple local angles a{x) by 

Ts={e-^"(^)^s(x)}. (11) 
This induces a Hamiltonian for a{x), 

- H[Ts] = ^ Re (j<a;j/>e^("(^)-°(^)) + terms independent of a (12) 

<xy> 

with complex bond strengths 

J<.y> = {s\x) + ts\x)){s\y) - ts\y)) = J*^,>, (13) 

where s",a = 1,2,3, denotes the components of s. The random bonds gen- 
erated for the embedded xy-model are again ferromagnetic due to their fac- 
torized form. As for U{1) gauge fields their orientation has to be properly 
taken into account, of course. 

We now need an algorithm for a{x). For the version of MGMC of Ref. 12 
elementary moves are performed on B x B x . . . x B subblocks of the lattice. 
For such blocks one has a priory fixed profiles of kernels K{x) that vanish 
outside the block and are smooth. Possible choices in arbitrary dimensions 
are pyramids or the lowest mode sine waves with the block as a cavity. The 
kernels appear in the nonlocal Metropolis proposals 

{a{x)} {a{x) + ^K{x)}. (14) 
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These are accepted or rejected in the usual way, and \1/ is drawn from a sym- 
metric distribution with a width adjusted for reasonable acceptance. For a 
lattice of length L, which should be a power of two, one has to hierarchi- 
cally cover the lattice with blocks of sizes B = L/2, L/A, . . . ,1 with the last 
choice producing just local updates. It has turned out to be important that 
either the superimposed block lattice or equivalently the field configuration 
is randomly translated between updates, such that the block boundaries are 
not at fixed positions. Furthermore the generator A is randomized to achieve 
ergodicity. 

For 0{n) models this MGMC algorithm is presumably inferior to clus- 
ter methods, although detailed comparisons would be somewhat hardware 
dependent. The importance of the technique derives however from the fact, 
that it can also be used for CP"-models and for SU{n) valued spins. The 
main change is that appropriate generators (from U{n) and SU{n) respec- 
tively) have to be substituted for A. The resulting embedded U{1) system 
now can have frustration depending on the configuration of the "background" 
spins s. Practical tests for the S'[/(3) x S'f/(3) chiral model and for the CP^ 
system have shown that these frustrations do not seriously slow down the 
evolution. In all three cases z = 0.2(1) has been found. 

5 Overrelaxation 

In contrast to the two previous algorithms overrelaxation (OR) achieves an 
improvement (down to z ~ 1) with still local updates only, and hence it 
is as fast or even somewhat faster per sweep than standard algorithms. It 
also immediately carries over to gauge fields. We now present OR in its 
"hybrid" form that found many applications recently rather than the original 
"tunable" version^^. We consider the local update problem at site x with local 
Boltzmann weight 

g/3s(x).M(x) ^j^g^g ^^^^ ^ ^^y^^ ^^5^ 

y=n.n. of x 

again for an 0{n) model spin for illustration. A local heatbath procedure 
amounts to choosing a new s{x) independently of the old one with the weight 
(|T3|). For OR we need additional microcanonical steps producing transitions 
Si{x) S2{x) such that 
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• si{x),S2{x) have the same local weight ( |T5| ) and thus the energy is 
unchanged, 

• Si{x), S2{x) are as far from each other as possible. 
For our example this principle leads to the change 

M M ■ si(x) 

s,{x) = -s,{x) + 2 Jp^- (16) 

Experience has shown that both for vectorization and for the nonuniversal it 
is best to group the local updates such as to do a maximum number of in- 
dependent ones in parallel, which usually amounts to checkerboard ordering. 
Now an OR iteration consists of microcanonical sweeps followed by one 
heatbath or other standard ergodic step. From exact gaussian analysis^^ and 
from numerical experiments^^ it is known that to achieve 2; 1 it is necessary 
do let N grow proportional to C, as criticality is approached. Often N = ^ 
is a reasonably good trial value. The goal is to achieve a roughly constant 
autocorrelation time when measured in iterations which implies z ^ 1 when 
referring to sweeps. 

In particular, this kind of OR is the present method of choice for SU(2) 
lattice gauge fields (either fundamental or embedded to move SU(3) fields). 
The local problem for a link variable U^{x) G SU{2) coincides with the 0(4) 
case when SU{2) matrices are expanded in terms of the unit- and the Pauli- 
matrices. 

We close with the example of a recent simulation of the SU{2) gauge 
theory^'' where it was possible to determine the relation between r and a 
scale in lattice units for a whole range of scales as shown in Fig. 1. In this 
study a renormalized coupling constant was held fixed which is equivalent to 
a finite size scaling limit at fixed L\fK with the string tension K assuming 
the role of a correlation length. The time r refers to independently estimating 
the renormalized coupling. The line in the plot represents a fit with the form 
(|lD giving z = 1.0(1) and c = 0.5(1). For further details on algorithmic and 
physical aspects of this work we have to refer to Ref. 17. 

6 Conclusions 

We have presented some of the accelerated algorithms for the Monte Carlo 
simulation of spin and gauge theories that have been discovered and tested 
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Figure 1: Autocorrelation time in sweeps for four dimensional SU{2) lattice 
gauge theory in a finite size scaling limit. 

in recent years. As a result, critical continuum behavior can be studied now 
much more accurately, especially in two and three dimensions. In four di- 
mensions, which is the most interesting case for particle physics, the situation 
will become similar as larger systems will be studied on future computers. 
In the presence of Goldstones modes, the new techniques are crucial already 
now. 

When algorithms of the multigrid type with z <1 will be found for gauge 
theories, it will strongly depend on the overhead inflicted, at which system 
size they really become profitable. In simulations of the CP^ model, for 
instance, it has been found that on vector machines the crossover between 
OR and MGMC occurs only for correlation lengths ^ = 20 . . . 30.^^'^* 
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